HAUCA Curves for the Evaluation of Biomarker Pilot Studies with Small Sample Sizes and Large Numbers of Features

Authors

  • Frank Klawonn
  • Junxi Wang
  • Ina Koch
  • Jörg Eberhard
  • Mohamed Omar
Abstract

Biomarker studies often try to identify a combination of measured attributes that supports the diagnosis of a specific disease. The measurements commonly come from high-throughput technologies such as next-generation sequencing, which yield an abundance of biomarker candidates relative to an often very small sample size. Here we use an example with more than 50,000 biomarker candidates that we want to evaluate based on a sample of only 24 patients. This seems an impossible task, and finding correlations that arise purely by chance is guaranteed. Although specific biomarkers cannot be identified in such small pilot studies with purely statistical methods, one can still determine whether more biomarkers show a high correlation with the disease under consideration than would be expected if all correlations were purely random. We propose a method based on area under the ROC curve (AUC) values that indicates how far the correlations of the biomarkers with the disease of interest exceed pure random effects. We also provide estimates of the sample sizes that follow-up studies need in order to identify concrete biomarkers and build classifiers for the disease, and we describe how the method can be extended to performance measures other than AUC.
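
The comparison against "purely random" correlations can be made concrete with a small calculation. The following Python sketch is not the authors' HAUCA procedure; it only illustrates the underlying idea by computing, from the exact null distribution of the Mann-Whitney statistic, how many of the candidate biomarkers would be expected to reach a given AUC by chance alone. The 12/12 split of the 24 patients, the absence of tied measurements, and all function names are assumptions made for illustration.

```python
from functools import lru_cache
from math import ceil, comb


@lru_cache(maxsize=None)
def count_u(u, n_pos, n_neg):
    """Number of rankings of n_pos diseased and n_neg healthy samples
    whose Mann-Whitney statistic equals u (assumes no tied values)."""
    if u < 0 or u > n_pos * n_neg:
        return 0
    if n_pos == 0 or n_neg == 0:
        return 1  # only u == 0 reaches this point
    # The largest value is either a diseased sample (it beats all n_neg
    # healthy samples) or a healthy sample (it adds nothing to u).
    return count_u(u - n_neg, n_pos - 1, n_neg) + count_u(u, n_pos, n_neg - 1)


def null_tail_prob(auc_threshold, n_pos, n_neg):
    """P(AUC >= auc_threshold) for a feature unrelated to the disease."""
    pairs = n_pos * n_neg
    u_min = ceil(auc_threshold * pairs - 1e-9)  # AUC = U / pairs
    favourable = sum(count_u(u, n_pos, n_neg) for u in range(u_min, pairs + 1))
    return favourable / comb(n_pos + n_neg, n_pos)


def expected_random_hits(n_features, auc_threshold, n_pos, n_neg):
    """Expected number of pure-noise features reaching the AUC threshold."""
    return n_features * null_tail_prob(auc_threshold, n_pos, n_neg)


if __name__ == "__main__":
    # 50,000 candidates and 24 patients come from the abstract;
    # the 12/12 class split is a hypothetical choice.
    for threshold in (0.8, 0.9, 1.0):
        hits = expected_random_hits(50_000, threshold, 12, 12)
        print(f"AUC >= {threshold}: about {hits:.2f} features expected by chance")
```

If a pilot study finds clearly more features above a threshold than this expectation, that suggests, without identifying any individual biomarker, that some of the signal is not random. The sketch is one-sided; features with an AUC close to 0 are equally informative and could be counted by applying the same calculation to the mirrored threshold.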

Publication date: 2016